IMU Experiment in IR4QA at NTCIR-8

Authors

  • Xiangdong Su
  • Xueliang Yan
  • Guanglai Gao
  • Hongxi Wei

Abstract

This paper describes our work on the IR4QA subtask. The IR system we designed for this task consists of two modules: (1) query processing; (2) indexing, retrieval and re-ranking. We first study question classification methods and weighting strategies based on the question classification result. Baidu and Wanfang resources are exploited for query expansion. After studying the characteristics of each index format and each index unit, we create three indexes of different types: KeyFile-UnigramIndex, KeyFile-Word-Index and Indri-Word-Index. We then use an interpolation method to re-rank the documents returned from these three indexes. Our system achieved 0.4266 mean AP, 0.4628 mean Q and 0.6761 mean nDCG in the final evaluation, which supports the effectiveness of our approach.
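
The re-ranking step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the interpolation formula, the weights, or any score normalization, so everything below is an assumption made for illustration: scores from each index are min-max normalized and combined as a weighted sum, and the function names, weights, and document scores are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one index's scores into [0, 1]; a constant run maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate_runs(runs, weights):
    """Re-rank documents by a weighted sum of normalized per-index scores.

    runs:    {index_name: {doc_id: raw_score}}
    weights: {index_name: interpolation weight} (assumed, not from the paper)
    A document missing from an index contributes 0 for that index.
    """
    if set(runs) != set(weights):
        raise ValueError("each index needs exactly one weight")
    combined = defaultdict(float)
    for name, scores in runs.items():
        for doc, s in min_max_normalize(scores).items():
            combined[doc] += weights[name] * s
    # Highest combined score first; ties broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for three documents from the three indexes.
runs = {
    "KeyFile-UnigramIndex": {"d1": 12.0, "d2": 9.0, "d3": 3.0},
    "KeyFile-Word-Index": {"d2": 7.5, "d3": 5.0},
    "Indri-Word-Index": {"d1": -4.0, "d3": -6.0},
}
weights = {
    "KeyFile-UnigramIndex": 0.4,
    "KeyFile-Word-Index": 0.3,
    "Indri-Word-Index": 0.3,
}
ranked = interpolate_runs(runs, weights)
```

Normalizing before combining matters in a setup like this because different retrieval engines can report scores on incompatible scales (for example, log-probabilities versus positive similarity scores), so raw scores cannot be summed directly.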

Similar resources

Statistical Machine Translation based Passage Retrieval - Experiment at NTCIR-7 IR4QA Task

In this paper, we apply statistical machine translation based passage retrieval, which was proposed at the last NTCIR-6 CLQA subtask, to the IR4QA Task. The experimental evaluation shows that the method is more effective for the relation and event type questions, which are longer and include relatively many common keywords, than for the definition and biography type questions, which are short...

Wikipedia Article Content Based Query Expansion in IR4QA System

This paper describes the work of our WUST group in NTCIR-8 on the subtask of English to Simplified Chinese and Simplified Chinese to Simplified Chinese information retrieval for question answering (EN-CS and CS-CS IR4QA). In order to enhance the precision and efficiency in question analysis, we employ a special question analysis method extracting more appropriate key terms and apply the query e...

Are Popular Documents More Likely To Be Relevant? A Dive into the ACLIA IR4QA Pools

The ACLIA IR4QA Task at NTCIR-7 is an ad hoc document retrieval task involving three document languages. Although IR4QA used pooling for collecting relevance assessments, it was unique in that the pooled documents were sorted before presenting them to the assessors, based on the assumption that “popular” documents are more likely to be relevant than others. We show that this assumption is indee...

Publication date: 2010